Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use descriptive container names #4670

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

pat-s
Copy link
Contributor

@pat-s pat-s commented Jan 5, 2025

supersedes #4245

  • Shorten the UUID to shorten container names (also makes it easier to spot pods from the same workflow)
  • Add step name to container name
  • Add workflow name to container name
  • Improve DNS name checker function

The UUID needs to stay as there can be multiple parallel runs of the same workflow which must have distinct names.

@pat-s pat-s added agent feature add new functionality labels Jan 5, 2025
@pat-s pat-s changed the title Use descriptive names for container names Use descriptive container names Jan 5, 2025
Copy link

codecov bot commented Jan 5, 2025

Codecov Report

Attention: Patch coverage is 37.28814% with 74 lines in your changes missing coverage. Please review.

Project coverage is 28.15%. Comparing base (c19723e) to head (5873d01).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pipeline/backend/kubernetes/kubernetes.go 0.00% 15 Missing ⚠️
pipeline/backend/docker/docker.go 0.00% 13 Missing ⚠️
pipeline/backend/local/local.go 0.00% 11 Missing ⚠️
pipeline/pipeline.go 0.00% 11 Missing ⚠️
pipeline/backend/kubernetes/pod.go 68.42% 6 Missing ⚠️
pipeline/backend/kubernetes/secrets.go 54.54% 5 Missing ⚠️
agent/runner.go 0.00% 4 Missing ⚠️
pipeline/backend/kubernetes/service.go 50.00% 4 Missing ⚠️
pipeline/backend/dummy/dummy.go 78.57% 2 Missing and 1 partial ⚠️
pipeline/backend/kubernetes/utils.go 80.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4670      +/-   ##
==========================================
- Coverage   28.16%   28.15%   -0.02%     
==========================================
  Files         398      398              
  Lines       28205    28220      +15     
==========================================
  Hits         7944     7944              
- Misses      19555    19568      +13     
- Partials      706      708       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

func toContainerName(step *types.Step) string {
return "wp_" + step.UUID
func toContainerName(step *types.Step, workflowName string) string {
return "wp_" + workflowName + "-" + step.Name + "-" + step.UUID[:5]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of UUID are we using? Some have that first part is a timestamp component so that could be a problem, not sure if it is here tho

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a random uuid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems indeed that the first part is somehow static, at least on the last part changes on "restart" runs of the same pipeline.

Hence I will change it to the last 5 digits of the generated UUID.

Copy link
Contributor

@qwerty287 qwerty287 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we shorten the uuid at all?

AFAIK the uuid is only used for this, so there wouldn't be a need to generate that long ones.

agent/runner.go Outdated Show resolved Hide resolved
@pat-s
Copy link
Contributor Author

pat-s commented Jan 6, 2025

Why do we shorten the uuid at all?

IMO long UUIDs make all outputs a bit more messy for no real benefit. A 5-char one should suffice and do "the job".

However, I just noticed that we could actually drop the UUID at all. I thought that a unique one will be created for each pipeline, i.e. also when using "restart". This is not the case and hence such actions will conflict with a previous pipeline if "auto cancel on push" is off.
This is actually a separate issue and unrelated to this change, but again, I think the UUID could be dropped entirely.

@pat-s
Copy link
Contributor Author

pat-s commented Jan 7, 2025

Update to my previous comment: it seems that the UID is identical for pipelines on the first x digits but differs on the latter ones (e.g. from mid to end). Not sure why that is but this means instead of using the first 5 chars we should use the last 5 ones for a unique pod name.

This is important to be able to restart pipelines which don't automatically cancel the previous one without getting a conflict in the container name.

@pat-s pat-s requested review from qwerty287 and lafriks January 7, 2025 08:06
@qwerty287
Copy link
Contributor

qwerty287 commented Jan 7, 2025

Maybe we should just use a simple random char generator instead of a specific one for uuids? This way we can make sure that it's completely different and we also don't have to generate the full uuid if we don't needed it anyways.

@lafriks
Copy link
Contributor

lafriks commented Jan 7, 2025

Maybe you could give a use case for this? I have a server running quite some pipelines daily and we still had no case for the need to check running pipelines in containerization side. Also workflow name&step name still does not sound very useful as they will most likely be very similar if not the same between multiple repos (build, deploy etc)

@pat-s
Copy link
Contributor Author

pat-s commented Jan 7, 2025

  1. Objectively I see no reason to not have descriptive container names
  2. I am using WP extensively in a way that I have > 10 parallel pipelines running for days. These sometimes get stuck and also always exceed the log limit in the UI, i.e. I have to check the logs individually in the containers. This is not an issue on k8s, however the main issue is that I can't directly see which matrix run I am looking at

Also workflow name&step name still does not sound very useful as they will most likely be very similar if not the same between multiple repos

Why not? And if you state so, it would be good to also name an alternative. I don't see any. And using a random ID as of now is certainly not better. The only information we have about a pipeline it the name and the enclosing step names. Users are responsible for the names, so it is up to them to use descriptive ones that don't repeat themselves everywhere.

@pat-s
Copy link
Contributor Author

pat-s commented Jan 9, 2025

@qwerty287 @lafriks Anything left?

@lafriks
Copy link
Contributor

lafriks commented Jan 9, 2025

Personally I don't see use for this but I don't really care how containers are named as long as they don't conflict (reason why we moved away from similar naming previously)

Imho better use would be to use labels to add metadata about org/repo/workflow/step for container

@pat-s
Copy link
Contributor Author

pat-s commented Jan 9, 2025

As said, if you have dozens of builds running over days you value descriptive container names.
Aside from this, this might already be helpful if one runs a large matrix workflow and target a specific container to possibly exec in.

Imho better use would be to use labels to add metadata about org/repo/workflow/step for container

This is something different and doesn't help when inspecting pods interactively and wanting to know which one does what. But it can surely be added in a different PR, more metadata is always better.

@xoxys
Copy link
Member

xoxys commented Jan 10, 2025

To be honest my personal opinion is debugging in running pods is the wrong way in any case. Besides that Im pretty sure uuids are not guaranteed to be unique anymore if you are shortening them randomly.

This is something different and doesn't help when inspecting pods

Why? Just use label selectors to get your pod why do we need this information in the name?

@pat-s
Copy link
Contributor Author

pat-s commented Jan 10, 2025

To be honest my personal opinion is debugging in running pods is the wrong way in any case.

It is not necessarily primarily about "debugging". As said, I have builds running for days sometimes (the use case doesn't matter) and logs in the UI will stop streaming at some point.

It is essential to me at some point to check interactively what is happening in the pods. I do this in k9s and just filtering the pods by fuzzy search and then selecting the one I want. I don't wanna check on the metadata first, then exit again, open the log stream and continue (especially as I don't want to have to remember the exact namings upfront).
In this scenario sometimes > 20 WP pods are active, most are from matrix builds which even have similar metadata in many ways.

Having the workflow and step name in the name is really not hurting but only helps (?).

I can't understand the discussion around this actually, especially as you both argue like "I never do/need this and don't care about pod names" but at the same time keep arguing about it without approval ❓️


uuids are not guaranteed to be unique anymore if you are shortening them randomly

They are generated uniquely now and are not shortened anymore.

According to math calculators, the collision probability of two consecutive strings being identical is
0.00000165 % (with lowercase chars and numbers for a 5-length random string.)
And yes, I know that there are more pipelines potentially running in parallel and this will increase the likelihood - but I still think we won't see that case ever in our lifes. (I am also happy to make if a 6-char string...)

@xoxys
Copy link
Member

xoxys commented Jan 10, 2025

Alright, then I will step back from reviews of features I dont care about? My opinion is, adding meaningful magic to container names (in this case container names can get very long due to all the metadata controlled by users) is wrong and could lead to issues and should therefor be avoided. One example moby/moby#46928

Adding feedback even if I dont have a use case is an essential part of the review process. However I have not blocked this PR so if another maintainer is fine with this approach it can still be approved and merged.

@pat-s
Copy link
Contributor Author

pat-s commented Jan 10, 2025

Alright, then I will step back from reviews of features I dont care about?

Cmon, I am just arguing with the same enthusiasm as you often do...no hard feeling ;)

ould lead to issues and should therefor be avoided. One example moby/moby#46928

Alright, this is a valid point in this discussion. I thought about this as well and the PR already contains this:

	const maxPodNameLength = 253
	if len(res) > maxPodNameLength {
		res = res[:maxPodNameLength]
	}

Adding feedback even if I dont have a use case is an essential part of the review process. However I have not blocked this PR so if another maintainer is fine with this approach it can still be approved and merged.

And this is very welcome, I asked for it!
With "blocking" I meant not actively approving which is an effective semi-block in the end, I'd say? Difficult topic 😄️

Okay, last argument: I wouldn't have opened this PR, changed all these files and modified the generic if this "was just a cosmetic change" for me. While I agree my use case for this not being super standard, I might surely not be the only one out there.

@xoxys
Copy link
Member

xoxys commented Jan 10, 2025

Alright, this is a valid point in this discussion. I thought about this as well and the PR already contains this:

But this logic only applies to the Kubernetes backend and as the UUID is the last part of the name it is truncated in the worst case, leading to duplicate name issues.

Besides that, I can see even more issues with the following criteria https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-subdomain-names How do you want to ensure these criteria? Edit: I see you have added a regex already, but I have still some questions:

  • This means this change is breaking, as users will be very limited in naming steps now?
  • As it can't be disabled, even users that don't need this feature cant use special chars in step names anymore?
  • What happens with other languages? Cyrillic or Chinese step names will not work anymore?
  • What happens with other container backends? Docker has no filtering/sanitizing added for now?

Example:

image

I still believe this is the wrong direction and not achievable without adding a lot of complexity.

@xoxys
Copy link
Member

xoxys commented Jan 11, 2025

Can we not use an opposide aprroach? What about adding the required metadata to the pipeline/step? We could store the agent and container/pod information and provide an API (and maybe cli/ui helper) to get those information. Drone has implemented a different approach https://docs.drone.io/runner/docker/configuration/reference/drone-tmate-enabled/

@pat-s
Copy link
Contributor Author

pat-s commented Jan 11, 2025

Can't we just put this behind a server env var and let the admin decide? All it takes in the code is then a one line if condition on how container names are constructed.

@xoxys
Copy link
Member

xoxys commented Jan 11, 2025

Sure we can but this has still a lot of disadvantages as it massively limits the possible steb names for all users. E.g. this means und public instances like codeberg users cant even use their native languages for step names anymore. Long story short, for me this is devinetly the wrong way, sorry.

@pat-s
Copy link
Contributor Author

pat-s commented Jan 11, 2025

I wouldn't set it as the default.

Still, I need this so heavily for my workflows that I will have to use a forked version forever then.

massively limits the possible steb names for all users.

I doubt that so many emoji-based stepnames exists. And we can add a warning to the setting with its limitations.

public instances like codeberg

I wouldn't deploy/enable it there as users don't have access to the containers anyhow.

@xoxys
Copy link
Member

xoxys commented Jan 11, 2025

I have also made multiple suggestions for alternative solutions. I see your use case and dont say we should provide a solution.

I doubt that so many emoji-based stepnames exists. And we can add a warning to the setting with its limitations.

Its not only about emojies not even chines or cyrillic step names would work

@pat-s
Copy link
Contributor Author

pat-s commented Jan 11, 2025

I have also made multiple suggestions for alternative solutions. I see your use case and dont say we should provide a solution.

So what about my previous suggestion with adding it as an opt-in setting with a warning in the docs?

Its not only about emojies not even chines or cyrillic step names would work

Maybe there is a way to "sanitize" such characters automatically somehow. But imo this is also somewhat of an edge case I'd argue, if one really opts-in and then uses non-latin characters in step names...

Not sure if WP needs to account for every edge case like this in a opt-in setting then? (but overall good point I haven't thought about).

@xoxys
Copy link
Member

xoxys commented Jan 11, 2025

I really think this approach is wrong regardless of opt-in, sorry. But maybe someone else disagrees.

@pat-s
Copy link
Contributor Author

pat-s commented Jan 11, 2025

Can't understand your final reasoning, especially when adding it as an opt-in and outlining the potential downsides in the docs.

I understand the general doubts and potential downsides and you convinced me not setting it as the default, but I am somewhat disappointed that even an opt-in is denied (as I can't understand the reasons for this argument).

The metadata workaround doesn't help me in any way for the outlined use case.

@xoxys
Copy link
Member

xoxys commented Jan 11, 2025

Then I maybe dont understand your use case good enough. My understanding: There are long running pipelines and you need to check the container logs. To do so you need to know which container is executing the step you want to inspect. This can be done by descriptive container names. But it should also be possible to lookup the container name by e.g. woodpecker-cli pipeline ps which displays all steps of a pipeline (already implemented) incl agent and container metadata. This way you get a step <-> container name relation as well without the downsides of the descriprive container names.

@lafriks
Copy link
Contributor

lafriks commented Jan 14, 2025

Still I don't understand why not use labels that could be assigned by agent to containers/pods when creating them? You could easily use that to find needed pod by just using label filter ex. kubectl get pods -l 'owner=org, repository=repo, workflow=frontend, step=build' and this way find exact pod that you are searching for even way better than by name, similarly docker ps can filter by labels

You can also specify kubectl get pods --show-labels to identify easily in list what is what

@xoxys
Copy link
Member

xoxys commented Jan 15, 2025

Labels doesn't work/are very limited for the docker backend.

Edit: Maybe Im wrong. Looks like labels can be filtered with docker ps --filter label=* now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent feature add new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants